Search | VHL Regional Portal

1.

HLAEquity: Examining biases in pan-allele peptide-HLA binding predictors.

Conev, Anja; Fasoulis, Romanos; Hall-Swan, Sarah; Ferreira, Rodrigo; Kavraki, Lydia E.

iScience ; 27(1): 108613, 2024 Jan 19.

Article in English | MEDLINE | ID: mdl-38188519

ABSTRACT

Peptide-HLA (pHLA) binding prediction is essential in screening peptide candidates for personalized peptide vaccines. Machine learning (ML) pHLA binding prediction tools are trained on vast amounts of data and are effective in screening peptide candidates. Most ML models report the ability to generalize to HLA alleles unseen during training ("pan-allele" models). However, the use of datasets with imbalanced allele content raises concerns about biased model performance. First, we examine the data bias of two ML-based pan-allele pHLA binding predictors. We find that the pHLA datasets overrepresent alleles from geographic populations of high-income countries. Second, we show that the identified data bias is perpetuated within ML models, leading to algorithmic bias and subpar performance for alleles expressed in low-income geographic populations. We draw attention to the potential therapeutic consequences of this bias, and we challenge the use of the term "pan-allele" to describe models trained with currently available public datasets.

2.

PepSim: T-cell cross-reactivity prediction via comparison of peptide sequence and peptide-HLA structure.

Hall-Swan, Sarah; Slone, Jared; Rigo, Mauricio M; Antunes, Dinler A; Lizée, Gregory; Kavraki, Lydia E.

Front Immunol ; 14: 1108303, 2023.

Article in English | MEDLINE | ID: mdl-37187737

ABSTRACT

Introduction: Peptide-HLA class I (pHLA) complexes on the surface of tumor cells can be targeted by cytotoxic T-cells to eliminate tumors, and this is one of the bases for T-cell-based immunotherapies. However, there exist cases where therapeutic T-cells directed towards tumor pHLA complexes may also recognize pHLAs from healthy normal cells. The process where the same T-cell clone recognizes more than one pHLA is referred to as T-cell cross-reactivity, and this process is driven mainly by features that make pHLAs similar to each other. T-cell cross-reactivity prediction is critical for designing T-cell-based cancer immunotherapies that are both effective and safe. Methods: Here we present PepSim, a novel score to predict T-cell cross-reactivity based on the structural and biochemical similarity of pHLAs. Results and discussion: We show our method can accurately separate cross-reactive from non-cross-reactive pHLAs in a diverse set of datasets including cancer, viral, and self-peptides. PepSim can be generalized to work on any dataset of class I peptide-HLAs and is freely available as a web server at pepsim.kavrakilab.org.

Subjects

Peptides; T-Lymphocytes, Cytotoxic; Amino Acid Sequence; Clone Cells

3.

SARS-Arena: Sequence and Structure-Guided Selection of Conserved Peptides from SARS-related Coronaviruses for Novel Vaccine Development.

Rigo, Mauricio Menegatti; Fasoulis, Romanos; Conev, Anja; Hall-Swan, Sarah; Antunes, Dinler Amaral; Kavraki, Lydia E.

Front Immunol ; 13: 931155, 2022.

Article in English | MEDLINE | ID: mdl-35903104

ABSTRACT

The pandemic caused by the SARS-CoV-2 virus, the agent responsible for the COVID-19 disease, has affected millions of people worldwide. There is a constant search for new therapies to either prevent or mitigate the disease. Fortunately, we have observed the successful development of multiple vaccines. Most of them are focused on one viral envelope protein, the spike protein. However, such focused approaches may contribute to the rise of new variants, fueled by the constant selection pressure on envelope proteins and the widespread dispersion of coronaviruses in nature. Therefore, it is important to examine other proteins, preferably those that are less susceptible to selection pressure, such as the nucleocapsid (N) protein. Even though the N protein is less accessible to humoral response, peptides from its conserved regions can be presented by class I Human Leukocyte Antigen (HLA) molecules, eliciting an immune response mediated by T-cells. Given the increasing number of protein sequences deposited daily in biological databases and the N protein conservation among viral strains, computational methods can be leveraged to discover potential new targets for SARS-CoV-2 and SARS-CoV-related viruses. Here we developed SARS-Arena, a user-friendly computational pipeline that can be used by practitioners of different levels of expertise for novel vaccine development. SARS-Arena combines sequence-based methods and structure-based analyses to (i) perform multiple sequence alignment (MSA) of SARS-CoV-related N protein sequences, (ii) recover candidate peptides of different lengths from conserved protein regions, and (iii) model the 3D structure of the conserved peptides in the context of different HLAs. We present two main Jupyter Notebook workflows that can help in the identification of new T-cell targets against SARS-CoV viruses. In fact, in a cross-reactive case study, our workflows identified a conserved N protein peptide (SPRWYFYYL) recognized by CD8+ T-cells in the context of HLA-B7+. SARS-Arena is available at https://github.com/KavrakiLab/SARS-Arena.

Subjects

COVID-19; SARS-CoV-2; CD8-Positive T-Lymphocytes; COVID-19/prevention & control; COVID-19 Vaccines; Epitopes, T-Lymphocyte; Humans; Peptides; Vaccine Development
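
Steps (i) and (ii) of the SARS-Arena abstract above (aligning N protein sequences, then recovering candidate peptides of different lengths from conserved regions) can be illustrated with a minimal Python sketch. This is not SARS-Arena's code: the function names, the per-column conservation score, and the threshold are assumptions chosen for illustration, and the toy alignment simply embeds the SPRWYFYYL peptide from the abstract between made-up flanking residues.

```python
from collections import Counter


def column_conservation(aligned):
    """Fraction of sequences sharing the most common residue in each column.

    `aligned` is a list of equal-length strings taken from an existing
    multiple sequence alignment ('-' marks a gap).
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    n = len(aligned)
    scores = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        # A column dominated by gaps is treated as not conserved.
        scores.append(0.0 if residue == "-" else count / n)
    return scores


def conserved_peptides(aligned, lengths=(8, 9, 10, 11), threshold=1.0):
    """Peptides (read from the first sequence) whose every column meets the threshold."""
    scores = column_conservation(aligned)
    reference = aligned[0]
    peptides = set()
    for k in lengths:
        for start in range(len(reference) - k + 1):
            window = reference[start:start + k]
            if "-" in window:
                continue  # skip windows that span a gap in the reference
            if all(score >= threshold for score in scores[start:start + k]):
                peptides.add(window)
    return sorted(peptides)


# Toy alignment (hypothetical flanks, not real N protein sequences).
toy_alignment = [
    "MASPRWYFYYLGT",
    "MTSPRWYFYYLGA",
    "MASPRWYFYYLGT",
]
print(conserved_peptides(toy_alignment, lengths=(9,)))
# -> ['PRWYFYYLG', 'SPRWYFYYL']
```

A strict threshold of 1.0 keeps only windows that are identical across all sequences; lowering it admits windows with some variation. Structural modeling of the resulting peptides, step (iii), is outside the scope of this sketch.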

4.

DINC-COVID: A webserver for ensemble docking with flexible SARS-CoV-2 proteins.

Hall-Swan, Sarah; Devaurs, Didier; Rigo, Mauricio M; Antunes, Dinler A; Kavraki, Lydia E; Zanatta, Geancarlo.

Comput Biol Med ; 139: 104943, 2021 Dec.

Article in English | MEDLINE | ID: mdl-34717233

ABSTRACT

An unprecedented research effort has been undertaken in response to the ongoing COVID-19 pandemic. This has included the determination of hundreds of crystallographic structures of SARS-CoV-2 proteins, and numerous virtual screening projects searching large compound libraries for potential drug inhibitors. Unfortunately, these initiatives have had very limited success in producing effective inhibitors against SARS-CoV-2 proteins. A reason might be an often overlooked factor in these computational efforts: receptor flexibility. To address this issue we have implemented a computational tool for ensemble docking with SARS-CoV-2 proteins. We have extracted representative ensembles of protein conformations from the Protein Data Bank and from in silico molecular dynamics simulations. Twelve pre-computed ensembles of SARS-CoV-2 protein conformations have now been made available for ensemble docking via a user-friendly webserver called DINC-COVID (dinc-covid.kavrakilab.org). We have validated DINC-COVID using data on tested inhibitors of two SARS-CoV-2 proteins, obtaining good correlations between docking-derived binding energies and experimentally-determined binding affinities. Some of the best results have been obtained on a dataset of large ligands resolved via room temperature crystallography, and therefore capturing alternative receptor conformations. In addition, we have shown that the ensembles available in DINC-COVID capture different ranges of receptor flexibility, and that this diversity is useful in finding alternative binding modes of ligands. Overall, our work highlights the importance of accounting for receptor flexibility in docking studies, and provides a platform for the identification of new inhibitors against SARS-CoV-2 proteins.

5.

DINC-COVID: A webserver for ensemble docking with flexible SARS-CoV-2 proteins.

Hall-Swan, Sarah; Antunes, Dinler A; Devaurs, Didier; Rigo, Mauricio M; Kavraki, Lydia E; Zanatta, Geancarlo.

bioRxiv ; 2021 Jan 22.

Article in English | MEDLINE | ID: mdl-33501448

ABSTRACT

MOTIVATION: Recent efforts to computationally identify inhibitors for SARS-CoV-2 proteins have largely ignored the issue of receptor flexibility. We have implemented a computational tool for ensemble docking with the SARS-CoV-2 proteins, including the main protease (Mpro), papain-like protease (PLpro) and RNA-dependent RNA polymerase (RdRp). RESULTS: Ensembles of other SARS-CoV-2 proteins are being prepared and made available through a user-friendly docking interface. Plausible binding modes between conformations of a selected ensemble and an uploaded ligand are generated by DINC, our parallelized meta-docking tool. Binding modes are scored with three scoring functions, and account for the flexibility of both the ligand and receptor. Additional details on our methods are provided in the supplementary material. AVAILABILITY: dinc-covid.kavrakilab.org. SUPPLEMENTARY INFORMATION: Details on methods for ensemble generation and docking are provided as supplementary data online. CONTACT: geancarlo.zanatta@ufc.br , kavraki@rice.edu.

6.

HLA-Arena: A Customizable Environment for the Structural Modeling and Analysis of Peptide-HLA Complexes for Cancer Immunotherapy.

Antunes, Dinler A; Abella, Jayvee R; Hall-Swan, Sarah; Devaurs, Didier; Conev, Anja; Moll, Mark; Lizée, Gregory; Kavraki, Lydia E.

JCO Clin Cancer Inform ; 4: 623-636, 2020 Jul.

Article in English | MEDLINE | ID: mdl-32667823

ABSTRACT

PURPOSE: HLA protein receptors play a key role in cellular immunity. They bind intracellular peptides and display them for recognition by T-cell lymphocytes. Because T-cell activation is partially driven by structural features of these peptide-HLA complexes, their structural modeling and analysis are becoming central components of cancer immunotherapy projects. Unfortunately, this kind of analysis is limited by the small number of experimentally determined structures of peptide-HLA complexes. Overcoming this limitation requires developing novel computational methods to model and analyze peptide-HLA structures. METHODS: Here we describe a new platform for the structural modeling and analysis of peptide-HLA complexes, called HLA-Arena, which we have implemented using Jupyter Notebook and Docker. It is a customizable environment that facilitates the use of computational tools, such as APE-Gen and DINC, which we have previously applied to peptide-HLA complexes. By integrating other commonly used tools, such as MODELLER and MHCflurry, this environment includes support for diverse tasks in structural modeling, analysis, and visualization. RESULTS: To illustrate the capabilities of HLA-Arena, we describe 3 example workflows applied to peptide-HLA complexes. Leveraging the strengths of our tools, DINC and APE-Gen, the first 2 workflows show how to perform geometry prediction for peptide-HLA complexes and structure-based binding prediction, respectively. The third workflow presents an example of large-scale virtual screening of peptides for multiple HLA alleles. CONCLUSION: These workflows illustrate the potential benefits of HLA-Arena for the structural modeling and analysis of peptide-HLA complexes. Because HLA-Arena can easily be integrated within larger computational pipelines, we expect its potential impact to vastly increase. For instance, it could be used to conduct structural analyses for personalized cancer immunotherapy, neoantigen discovery, or vaccine development.

Subjects

Neoplasms; Peptides; Humans; Immunotherapy; Neoplasms/therapy; T-Lymphocytes

7.

Using parallelized incremental meta-docking can solve the conformational sampling issue when docking large ligands to proteins.

Devaurs, Didier; Antunes, Dinler A; Hall-Swan, Sarah; Mitchell, Nicole; Moll, Mark; Lizée, Gregory; Kavraki, Lydia E.

BMC Mol Cell Biol ; 20(1): 42, 2019 Sep 5.

Article in English | MEDLINE | ID: mdl-31488048

ABSTRACT

BACKGROUND: Docking large ligands, and especially peptides, to protein receptors is still considered a challenge in computational structural biology. Besides the issue of accurately scoring the binding modes of a protein-ligand complex produced by a molecular docking tool, the conformational sampling of a large ligand is also often considered a challenge because of its underlying combinatorial complexity. In this study, we evaluate the impact of using parallelized and incremental paradigms on the accuracy and performance of conformational sampling when docking large ligands. We use five datasets of protein-ligand complexes involving ligands that could not be accurately docked by classical protein-ligand docking tools in previous similar studies. RESULTS: Our computational evaluation shows that simply increasing the amount of conformational sampling performed by a protein-ligand docking tool, such as Vina, by running it for longer is rarely beneficial. Instead, it is more efficient and advantageous to run several short instances of this docking tool in parallel and group their results together, in a straightforward parallelized docking protocol. Even greater accuracy and efficiency are achieved by our parallelized incremental meta-docking tool, DINC, showing the additional benefits of its incremental paradigm. Using DINC, we could accurately reproduce the vast majority of the protein-ligand complexes we considered. CONCLUSIONS: Our study suggests that, even when trying to dock large ligands to proteins, the conformational sampling of the ligand should no longer be considered an issue, as simple docking protocols using existing tools can solve it. Therefore, scoring should currently be regarded as the biggest unmet challenge in molecular docking.

Subjects

Algorithms; Molecular Docking Simulation; Proteins/chemistry; Databases, Protein; Ligands; Peptides/chemistry; Protein Conformation
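
The "straightforward parallelized docking protocol" described in the abstract above (several short runs in parallel, with their results grouped together) can be sketched in Python as follows. This is a toy under stated assumptions, not DINC and not Vina: `toy_energy` and `short_docking_run` are hypothetical stand-ins for a real docking tool, poses are plain lists of numbers, and threads stand in for what would be separate docking processes in practice.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_energy(pose):
    """Hypothetical stand-in for a docking score (lower is better)."""
    return sum((x - 0.5) ** 2 for x in pose)


def short_docking_run(seed, n_samples=200, n_dims=6, keep=3):
    """One short, independently seeded sampling run; returns its best (score, pose) pairs."""
    rng = random.Random(seed)
    poses = [[rng.random() for _ in range(n_dims)] for _ in range(n_samples)]
    scored = sorted((toy_energy(pose), pose) for pose in poses)
    return scored[:keep]


def parallel_docking(seeds, top=5):
    """Run several short runs in parallel, then group and rank their results."""
    with ThreadPoolExecutor() as pool:
        per_run = list(pool.map(short_docking_run, seeds))
    # Grouping step: pool the hits from every run and re-rank them together.
    merged = sorted(hit for run in per_run for hit in run)
    return merged[:top]


best = parallel_docking(range(8))
for score, pose in best:
    print(f"{score:.4f}", [round(x, 2) for x in pose])
```

Each run uses its own seed, so the runs explore different parts of the search space, and the merged ranking can never be worse than the best single run. The incremental paradigm that the abstract credits for DINC's additional gains is not modeled here.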

8.

Retraction Note: detangling PPI networks to uncover functionally meaningful clusters.

Hall-Swan, Sarah; Crawford, Jake; Newman, Rebecca; Cowen, Lenore J.

BMC Syst Biol ; 12(1): 113, 2018 Nov 19.

Article in English | MEDLINE | ID: mdl-30453938

ABSTRACT

The authors have retracted this article [1]. After publication they discovered a technical error in the Louvain algorithm with bounded cluster sizes. Correction of this error substantially changed the results for this algorithm and the conclusions drawn in the article were found to be incorrect. The authors will submit a new manuscript for peer review.

9.

Detangling PPI networks to uncover functionally meaningful clusters.

Hall-Swan, Sarah; Crawford, Jake; Newman, Rebecca; Cowen, Lenore J.

BMC Syst Biol ; 12(Suppl 3): 24, 2018 Mar 21.

Article in English | MEDLINE | ID: mdl-29589565

ABSTRACT

BACKGROUND: Decomposing a protein-protein interaction network (PPI network) into non-overlapping clusters or communities, sometimes called "network modules," is an important way to explore functional roles of sets of genes. When the method to accomplish this decomposition is solely based on purely graph-theoretic measures of the interconnection structure of the network, this is often called unsupervised clustering or community detection. In this study, we compare unsupervised computational methods for decomposing a PPI network into non-overlapping modules. A method is preferred if it results in a large proportion of nodes being assigned to functionally meaningful modules, as measured by functional enrichment over terms from the Gene Ontology (GO). RESULTS: We compare the performance of three popular community detection algorithms with the same algorithms run after the network is pre-processed by removing and reweighting based on the diffusion state distance (DSD) between pairs of nodes in the network. We call this "detangling" the network. In almost all cases, we find that detangling the network based on the DSD distance reweighting provides more meaningful clusters. CONCLUSIONS: Re-embedding using the DSD distance metric, before applying standard community detection algorithms, can assist in uncovering GO functionally enriched clusters in the yeast PPI network.
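
The diffusion state distance (DSD) that the abstract above uses to pre-process the network can be sketched in a few lines of Python. The sketch assumes the commonly described finite-k formulation, in which each node is represented by the expected number of visits a k-step random walk starting there makes to every node, and the distance between two nodes is the L1 distance between those vectors. The abstract does not state which DSD variant, which k, or which removal and reweighting thresholds were used, so the function names and defaults below are illustrative assumptions.

```python
def transition_matrix(adj):
    """Row-normalize an adjacency matrix into random-walk transition probabilities."""
    matrix = []
    for row in adj:
        degree = sum(row)
        if degree == 0:
            raise ValueError("every node needs at least one edge")
        matrix.append([weight / degree for weight in row])
    return matrix


def expected_visits(adj, k):
    """he[i][j]: expected visits to node j by a k-step walk from node i (step 0 counted)."""
    n = len(adj)
    p = transition_matrix(adj)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # P^0
    he = [row[:] for row in power]
    for _ in range(k):
        # Advance the walk one step: power <- power * P
        power = [
            [sum(power[i][m] * p[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)
        ]
        for i in range(n):
            for j in range(n):
                he[i][j] += power[i][j]
    return he


def dsd(adj, k=5):
    """Pairwise L1 distances between the expected-visit vectors of all nodes."""
    he = expected_visits(adj, k)
    n = len(adj)
    return [
        [sum(abs(a - b) for a, b in zip(he[u], he[v])) for v in range(n)]
        for u in range(n)
    ]


# Toy path graph 0 - 1 - 2 - 3 (hypothetical, not a real PPI network).
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
for row in dsd(path, k=3):
    print([round(value, 3) for value in row])
```

The resulting distance matrix could then feed a reweighting or edge-removal step before running a standard community detection algorithm; that step is left out because the abstract gives no details on it.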
